Supplementary File 1: Additional GPU-powered Tools for Bioinformatics and Computational Biology
Abstract
One area of Computational Biology where GPU acceleration can yield a relevant speed-up is the analysis of spectral data derived from, e.g., mass-spectrometry experiments. FastPaSS [4] is a tool that accelerates the identification of a spectrum in a spectral library by means of the SpectraST similarity scoring algorithm [14]. The core of the matching algorithm is a dot product of two vectors corresponding to the normalized intensities of the spectra, a calculation that is well suited to GPU acceleration. According to the results shown in [4], on an Nvidia GeForce 8600 GTS FastPaSS achieves an 8× speed-up with respect to a sequential execution optimized by pre-caching files. Tempest [19] extends the task of spectral matching by performing the whole similarity scoring, including the generation of theoretical fragmented spectra, directly on the GPU. Tempest exploits the asynchronous execution of CUDA kernels to implement a heterogeneous execution scheme in which the CPU performs database digestion while the GPU performs the scoring. Tempest implements two scoring functions: the first is based on cross-correlation, the second on the dot product. According to [19], Tempest's accelerated scoring function based on the dot product is two orders of magnitude faster than the correlation-based method. The correlation-based approach is nevertheless more accurate, and Tempest's speed-up is about 10× with respect to the CPU, using an Nvidia GeForce GTX 480. The acceleration of similarity scoring by means of GPUs was also investigated in [16], which showed that the spectral dot product can be strongly accelerated by intensive use of shared memory and parallel reduction techniques, reaching a speed-up of two orders of magnitude on an Nvidia GeForce GTX 280.
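As an illustration of the scoring kernel discussed above, the following is a minimal CPU-side sketch in Python of a normalized spectral dot product whose final sum follows the pairwise (tree) pattern of a parallel reduction. It is not the code of FastPaSS, Tempest, or [16]: the function names are hypothetical, the spectra are assumed to be already binned into intensity vectors of equal length, and the loop that a GPU would spread over threads runs sequentially here.

```python
import math


def normalize(intensities):
    """Scale an intensity vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in intensities))
    if norm == 0.0:
        raise ValueError("spectrum contains no signal")
    return [x / norm for x in intensities]


def tree_reduce_sum(values):
    """Sum values pairwise, halving the active range at every step.

    This mirrors the access pattern of a shared-memory reduction: on a GPU,
    all additions inside one step are independent and could each be done
    by a separate thread.
    """
    buf = list(values)
    if not buf:
        return 0.0
    # Pad with zeros up to the next power of two (assumption of this sketch).
    size = 1
    while size < len(buf):
        size *= 2
    buf.extend([0.0] * (size - len(buf)))
    stride = size // 2
    while stride > 0:
        for i in range(stride):  # independent additions within one step
            buf[i] += buf[i + stride]
        stride //= 2
    return buf[0]


def spectral_dot_product(query, library_entry):
    """Dot product of two normalized intensity vectors (1.0 = identical shape)."""
    if len(query) != len(library_entry):
        raise ValueError("spectra must be binned to the same length")
    q = normalize(query)
    r = normalize(library_entry)
    return tree_reduce_sum([a * b for a, b in zip(q, r)])
```

A library search would call `spectral_dot_product` once per candidate spectrum and keep the highest-scoring entries; because every candidate is scored independently, the search itself is also a natural target for parallel execution.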
A different example of GPU-powered spectral analysis concerns feature detection, which is used to identify and quantify the proteins contained in a sample; in particular, features must be separated from signal noise and baseline artifacts. Hussong et al. [11] employed an adaptive wavelet transform for this task, a process that can be parallelized on the GPU. Their implementation stores the spectrum in shared memory, allowing a 200× speed-up on real-world data sets. Since the methodology was implemented on an early CUDA architecture (namely, an Nvidia Tesla C870), the amount of available shared memory was limited to 16 KB (see Table 2 in Supplementary File 2). Thus, for larger signals, the tool automatically switches to texture memory, which affects the speed-up.
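To make the idea of wavelet-based feature detection concrete, the sketch below convolves a signal with a zero-sum Ricker ("Mexican hat") wavelet, so that a constant baseline cancels out, and reports local maxima of the response above a threshold. This is a generic stand-in, not the adaptive wavelet of Hussong et al. [11]; the function names, the scale, and the threshold are assumptions chosen for illustration. The code is sequential Python, whereas on a GPU each output sample could be computed by its own thread.

```python
import math


def ricker_kernel(scale):
    """Sampled Ricker wavelet, shifted so that its taps sum to zero."""
    half_width = int(4 * scale)
    taps = []
    for k in range(-half_width, half_width + 1):
        t = k / scale
        taps.append((1.0 - t * t) * math.exp(-0.5 * t * t))
    mean = sum(taps) / len(taps)
    # A zero-sum kernel gives no response to a constant baseline.
    return [w - mean for w in taps]


def wavelet_response(signal, kernel):
    """Correlate the signal with the kernel, replicating the edge samples."""
    half = len(kernel) // 2
    last = len(signal) - 1
    response = []
    for i in range(len(signal)):  # each i is independent of the others
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), last)  # clamp at the borders
            acc += w * signal[idx]
        response.append(acc)
    return response


def detect_features(signal, scale=2.0, threshold=1.0):
    """Return indices where the wavelet response has a local maximum above threshold."""
    resp = wavelet_response(signal, ricker_kernel(scale))
    peaks = []
    for i in range(1, len(resp) - 1):
        if resp[i] > threshold and resp[i] >= resp[i - 1] and resp[i] > resp[i + 1]:
            peaks.append(i)
    return peaks
```

In practice the scale would be matched to the expected peak width and the threshold to the noise level; both are fixed here only to keep the example short.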